Acoustic indicators of topic segmentation

Authors

  • Julia Hirschberg
  • Christine H. Nakatani
Abstract

The segmentation of text and speech into topics and subtopics is an important step in document interpretation. For text, formatting information, such as headings and paragraphing, is available to aid in this endeavor, although this information is by no means sufficient. For speech, the task is even more difficult. We present results of the application of machine learning techniques to the automatic identification of intonational phrases beginning and ending 'topics' determined independently by annotators for two corpora: the Boston Directions Corpus and the Broadcast News (HUB-4) DARPA/NIST database.

Related resources

Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input

We address the task of unsupervised topic segmentation of speech data operating over raw acoustic information. In contrast to existing algorithms for topic segmentation of speech, our approach does not require input transcripts. Our method predicts topic changes by analyzing the distribution of reoccurring acoustic patterns in the speech signal corresponding to a single speaker. The algorithm r...

Content-free Topic Segmentation with Acoustic Features (Report)

In my previous work, content-free topic segmentation is approached by classification methods, and the unit is Vocalization [6]. Speaker ID, vocalization start time, vocalization duration, pause, overlaps and their corresponding Horizon features are emphasized. This followed an approach to segmentation and classification introduced by Luz [2, 3] for analysing recordings of multidisciplinary medi...

A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling

In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...

Discourse Segmentation of Multi-Party Conversation

We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech. This segmentation algorithm uses automatically induced decision rules to combine the different features. The embedded...

Publication date: 1998